Overview

Dataset statistics

Number of variables: 20
Number of observations: 4913
Missing cells: 15540
Missing cells (%): 15.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 767.8 KiB
Average record size in memory: 160.0 B
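
The derived figures above can be cross-checked from the raw counts. This is a minimal sketch, assuming the missing-cell percentage is the number of missing cells divided by variables × observations, and that 1 KiB = 1024 bytes. The variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics above.
n_variables = 20
n_observations = 4913
missing_cells = 15540
total_size_kib = 767.8

# Share of missing cells over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both printed values should agree with the figures reported above after rounding to one decimal place.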

Variable types

Text: 10
DateTime: 1
Numeric: 8
Categorical: 1

Alerts

budget is highly overall correlated with box_office_total and 1 other field (High correlation)
box_office_total is highly overall correlated with budget and 1 other field (High correlation)
note_presse is highly overall correlated with note_spectateurs (High correlation)
note_spectateurs is highly overall correlated with note_presse (High correlation)
boxoffice is highly overall correlated with box_office_total (High correlation)
type_film is highly overall correlated with budget (High correlation)
type_film is highly imbalanced (99.5%) (Imbalance)
date has 305 (6.2%) missing values (Missing)
genre has 1765 (35.9%) missing values (Missing)
distributeur has 127 (2.6%) missing values (Missing)
titre_original has 2915 (59.3%) missing values (Missing)
nationalités has 3552 (72.3%) missing values (Missing)
budget has 2930 (59.6%) missing values (Missing)
box_office_total has 556 (11.3%) missing values (Missing)
note_spectateurs has 330 (6.7%) missing values (Missing)
recompenses has 2495 (50.8%) missing values (Missing)
description has 518 (10.5%) missing values (Missing)
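
The 99.5% imbalance figure for type_film is consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition, which this report does not state, and uses the type_film counts listed under that variable (4910, 2 and 1). `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class shares."""
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    k = len(counts)
    # Assumption: a single class has nothing to normalize, so return 0.0.
    return 1 - entropy / math.log2(k) if k > 1 else 0.0

score = imbalance_score([4910, 2, 1])
print(f"{score:.1%}")
```

Under this definition a perfectly balanced column scores 0 and a column dominated by a single class approaches 1.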

Reproduction

Analysis started: 2023-07-25 12:47:18.894170
Analysis finished: 2023-07-25 12:47:33.180361
Duration: 14.29 seconds
Software version: ydata-profiling v4.3.2
Download configuration: config.json
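
The reported duration follows from the two timestamps. A small sketch using only the standard library, with the timestamps copied from above:

```python
from datetime import datetime

started = datetime.fromisoformat("2023-07-25 12:47:18.894170")
finished = datetime.fromisoformat("2023-07-25 12:47:33.180361")

# Subtracting two datetimes yields a timedelta.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```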

Variables

titre
Text

Distinct: 4508
Distinct (%): 91.8%
Missing: 0
Missing (%): 0.0%
Memory size: 38.5 KiB

Length

Max length: 83
Median length: 57
Mean length: 16.206188
Min length: 1

Characters and Unicode

Total characters: 79621
Distinct characters: 113
Distinct categories: 13
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
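
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one of the sample titles. The two-letter codes correspond to the category names used in this report, for example `Ll` for Lowercase Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator, `Po` for Other Punctuation and `Pd` for Dash Punctuation.

```python
import unicodedata
from collections import Counter

title = "Mission : Impossible - Protocole fantôme"

# unicodedata.category returns a two-letter general category code per character.
categories = Counter(unicodedata.category(ch) for ch in title)

for code, count in categories.most_common():
    print(code, count)
```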

Unique

Unique: 4223
Unique (%): 86.0%
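
"Distinct" and "Unique" measure different things here: the distinct count is the number of different values, whereas the unique count appears to be the number of values that occur exactly once. A minimal sketch under that reading, using a hypothetical toy column rather than the real data:

```python
from collections import Counter

# Hypothetical toy column, not taken from the dataset.
titles = ["Le Challenge", "Un homme idéal", "Le Challenge", "Fast X"]

counts = Counter(titles)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)
```

This reading would explain why the unique count for titre (4223) is lower than its distinct count (4508): some titles occur more than once.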

Sample

1st row: Les Vengeances de Maître Poutifard
2nd row: Le Challenge
3rd row: La Petite sirène
4th row: Un homme idéal
5th row: Mission : Impossible - Protocole fantôme

Value                  Count    Frequency (%)
la                     588      4.1%
le                     530      3.7%
de                     425      2.9%
(blank)                424      2.9%
les                    385      2.7%
the                    256      1.8%
et                     203      1.4%
des                    179      1.2%
du                     178      1.2%
à                      113      0.8%
Other values (5150)    11213    77.4%
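
The word table above looks like the result of lowercasing each title, splitting it on whitespace, and counting the tokens. The sketch below works under that assumption (the report does not describe its tokenizer) and uses only the five sample titles, so its counts are not the report's counts.

```python
from collections import Counter

sample_titles = [
    "Les Vengeances de Maître Poutifard",
    "Le Challenge",
    "La Petite sirène",
    "Un homme idéal",
    "Mission : Impossible - Protocole fantôme",
]

# Lowercase, split on whitespace, and count every token.
word_counts = Counter(
    word for title in sample_titles for word in title.lower().split()
)
total = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(f"{word}\t{count}\t{100 * count / total:.1f}%")
```

With whitespace splitting, standalone punctuation such as ":" and "-" becomes a token of its own, which may be why a row with no visible value appears in the table.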

Most occurring characters

Value                 Count    Frequency (%)
(space)               9581     12.0%
e                     9250     11.6%
a                     5282     6.6%
r                     4489     5.6%
i                     4384     5.5%
n                     4285     5.4%
s                     4253     5.3%
o                     3898     4.9%
t                     3428     4.3%
l                     3157     4.0%
Other values (103)    27614    34.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 58621
73.6%
Space Separator 9581
 
12.0%
Uppercase Letter 9312
 
11.7%
Other Punctuation 1207
 
1.5%
Decimal Number 502
 
0.6%
Dash Punctuation 298
 
0.4%
Final Punctuation 43
 
0.1%
Open Punctuation 25
 
< 0.1%
Close Punctuation 25
 
< 0.1%
Other Symbol 3
 
< 0.1%
Other values (3) 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 9250
15.8%
a 5282
 
9.0%
r 4489
 
7.7%
i 4384
 
7.5%
n 4285
 
7.3%
s 4253
 
7.3%
o 3898
 
6.6%
t 3428
 
5.8%
l 3157
 
5.4%
u 2815
 
4.8%
Other values (37) 13380
22.8%
Uppercase Letter
ValueCountFrequency (%)
L 1536
16.5%
M 710
 
7.6%
C 639
 
6.9%
S 629
 
6.8%
T 601
 
6.5%
P 554
 
5.9%
A 509
 
5.5%
B 471
 
5.1%
D 449
 
4.8%
R 334
 
3.6%
Other values (23) 2880
30.9%
Other Punctuation
ValueCountFrequency (%)
' 557
46.1%
: 242
20.0%
, 162
 
13.4%
. 118
 
9.8%
! 62
 
5.1%
& 38
 
3.1%
? 19
 
1.6%
/ 3
 
0.2%
3
 
0.2%
# 3
 
0.2%
Decimal Number
ValueCountFrequency (%)
2 149
29.7%
3 80
15.9%
1 75
14.9%
0 48
 
9.6%
4 39
 
7.8%
7 27
 
5.4%
5 24
 
4.8%
8 21
 
4.2%
6 20
 
4.0%
9 19
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 291
97.7%
7
 
2.3%
Open Punctuation
ValueCountFrequency (%)
( 24
96.0%
[ 1
 
4.0%
Close Punctuation
ValueCountFrequency (%)
) 24
96.0%
] 1
 
4.0%
Other Symbol
ValueCountFrequency (%)
° 2
66.7%
® 1
33.3%
Space Separator
ValueCountFrequency (%)
9581
100.0%
Final Punctuation
ValueCountFrequency (%)
43
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 2
100.0%
Other Number
ValueCountFrequency (%)
² 1
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 67933
85.3%
Common 11688
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 9250
13.6%
a 5282
 
7.8%
r 4489
 
6.6%
i 4384
 
6.5%
n 4285
 
6.3%
s 4253
 
6.3%
o 3898
 
5.7%
t 3428
 
5.0%
l 3157
 
4.6%
u 2815
 
4.1%
Other values (70) 22692
33.4%
Common
ValueCountFrequency (%)
9581
82.0%
' 557
 
4.8%
- 291
 
2.5%
: 242
 
2.1%
, 162
 
1.4%
2 149
 
1.3%
. 118
 
1.0%
3 80
 
0.7%
1 75
 
0.6%
! 62
 
0.5%
Other values (23) 371
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 78206
98.2%
None 1362
 
1.7%
Punctuation 53
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9581
 
12.3%
e 9250
 
11.8%
a 5282
 
6.8%
r 4489
 
5.7%
i 4384
 
5.6%
n 4285
 
5.5%
s 4253
 
5.4%
o 3898
 
5.0%
t 3428
 
4.4%
l 3157
 
4.0%
Other values (69) 26199
33.5%
None
ValueCountFrequency (%)
é 726
53.3%
è 219
 
16.1%
à 111
 
8.1%
ê 82
 
6.0%
ô 39
 
2.9%
ï 23
 
1.7%
À 22
 
1.6%
ç 22
 
1.6%
â 20
 
1.5%
É 19
 
1.4%
Other values (21) 79
 
5.8%
Punctuation
ValueCountFrequency (%)
43
81.1%
7
 
13.2%
3
 
5.7%

date
Date

MISSING 

Distinct: 1292
Distinct (%): 28.0%
Missing: 305
Missing (%): 6.2%
Memory size: 38.5 KiB
Minimum: 1933-09-29 00:00:00
Maximum: 2023-12-07 00:00:00
Histogram with fixed size bins (bins=50)

genre
Text

MISSING 

Distinct336
Distinct (%)10.7%
Missing1765
Missing (%)35.9%
Memory size38.5 KiB

Length

Max length58
Median length40
Mean length12.07878
Min length5

Characters and Unicode

Total characters38024
Distinct characters41
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique154 ?
Unique (%)4.9%

Sample

1st rowFamille
2nd rowFamille,Fantastique
3rd rowEspionnage,Thriller
4th rowHistorique
5th rowDrame
ValueCountFrequency (%)
drame 360
 
10.6%
romance 271
 
8.0%
thriller 259
 
7.6%
comédie 177
 
5.2%
action 153
 
4.5%
fiction 120
 
3.5%
aventure 102
 
3.0%
science 93
 
2.7%
policier 91
 
2.7%
fantastique 87
 
2.6%
Other values (319) 1694
49.7%

Most occurring characters

ValueCountFrequency (%)
e 4867
12.8%
i 3660
 
9.6%
r 3072
 
8.1%
a 2531
 
6.7%
o 2253
 
5.9%
n 2218
 
5.8%
m 2053
 
5.4%
t 2030
 
5.3%
l 1857
 
4.9%
c 1762
 
4.6%
Other values (31) 11721
30.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 31566
83.0%
Uppercase Letter 4622
 
12.2%
Other Punctuation 1454
 
3.8%
Space Separator 259
 
0.7%
Dash Punctuation 123
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4867
15.4%
i 3660
11.6%
r 3072
9.7%
a 2531
8.0%
o 2253
7.1%
n 2218
7.0%
m 2053
 
6.5%
t 2030
 
6.4%
l 1857
 
5.9%
c 1762
 
5.6%
Other values (13) 5263
16.7%
Uppercase Letter
ValueCountFrequency (%)
A 891
19.3%
D 753
16.3%
F 615
13.3%
T 468
10.1%
C 459
9.9%
R 440
9.5%
P 192
 
4.2%
E 188
 
4.1%
S 174
 
3.8%
H 144
 
3.1%
Other values (5) 298
 
6.4%
Other Punctuation
ValueCountFrequency (%)
, 1454
100.0%
Space Separator
ValueCountFrequency (%)
259
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 123
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 36188
95.2%
Common 1836
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4867
13.4%
i 3660
 
10.1%
r 3072
 
8.5%
a 2531
 
7.0%
o 2253
 
6.2%
n 2218
 
6.1%
m 2053
 
5.7%
t 2030
 
5.6%
l 1857
 
5.1%
c 1762
 
4.9%
Other values (28) 9885
27.3%
Common
ValueCountFrequency (%)
, 1454
79.2%
259
 
14.1%
- 123
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37560
98.8%
None 464
 
1.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 4867
13.0%
i 3660
 
9.7%
r 3072
 
8.2%
a 2531
 
6.7%
o 2253
 
6.0%
n 2218
 
5.9%
m 2053
 
5.5%
t 2030
 
5.4%
l 1857
 
4.9%
c 1762
 
4.7%
Other values (30) 11257
30.0%
None
ValueCountFrequency (%)
é 464
100.0%

durée
Real number (ℝ)

Distinct: 143
Distinct (%): 2.9%
Missing: 7
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 107.71035
Minimum: 26
Maximum: 327
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 38.5 KiB

Quantile statistics

Minimum: 26
5-th percentile: 84
Q1: 95
Median: 105
Q3: 118
95-th percentile: 142.75
Maximum: 327
Range: 301
Interquartile range (IQR): 23
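
The range and interquartile range follow directly from the quantiles listed above. The sketch below recomputes them and, as an addition that is not part of the report, derives the conventional 1.5 × IQR fences often used to flag outlying durations.

```python
# Quantiles for durée, copied from the report.
minimum, q1, q3, maximum = 26, 95, 118, 327

value_range = maximum - minimum
iqr = q3 - q1

# Assumption: Tukey-style 1.5 x IQR fences; the report does not compute these.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"Range: {value_range}, IQR: {iqr}")
print(f"Fences: [{lower_fence}, {upper_fence}]")
```

Under that convention, the listed minimum (26) and maximum (327) both fall outside the fences.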

Descriptive statistics

Standard deviation: 19.438643
Coefficient of variation (CV): 0.18047144
Kurtosis: 5.4524366
Mean: 107.71035
Median Absolute Deviation (MAD): 11
Skewness: 1.1783267
Sum: 528427
Variance: 377.86084
Monotonicity: Not monotonic
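
Several of these statistics are tied together by simple identities, which gives a quick consistency check. The sketch assumes the coefficient of variation is the standard deviation divided by the mean, and that the mean is taken over non-missing values only (4913 observations minus 7 missing).

```python
# Values for durée, copied from the report.
mean = 107.71035
std = 19.438643
total_sum = 528427
n_non_missing = 4913 - 7

cv = std / mean                          # coefficient of variation
variance = std ** 2                      # variance is the squared standard deviation
mean_check = total_sum / n_non_missing   # mean recomputed from the sum

print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.3f}")
print(f"Mean from sum: {mean_check:.5f}")
```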
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
90 195
 
4.0%
100 194
 
3.9%
105 170
 
3.5%
95 165
 
3.4%
107 132
 
2.7%
98 129
 
2.6%
110 128
 
2.6%
97 126
 
2.6%
102 119
 
2.4%
93 111
 
2.3%
Other values (133) 3437
70.0%
ValueCountFrequency (%)
26 1
 
< 0.1%
35 3
0.1%
38 1
 
< 0.1%
39 1
 
< 0.1%
40 4
0.1%
41 1
 
< 0.1%
42 1
 
< 0.1%
44 1
 
< 0.1%
45 1
 
< 0.1%
46 1
 
< 0.1%
ValueCountFrequency (%)
327 1
< 0.1%
231 1
< 0.1%
212 1
< 0.1%
201 2
< 0.1%
197 1
< 0.1%
196 1
< 0.1%
194 2
< 0.1%
192 2
< 0.1%
190 1
< 0.1%
189 1
< 0.1%
Distinct2432
Distinct (%)49.5%
Missing0
Missing (%)0.0%
Memory size38.5 KiB

Length

Max length32
Median length27
Mean length13.744759
Min length3

Characters and Unicode

Total characters67528
Distinct characters90
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1423 ?
Unique (%)29.0%

Sample

1st rowPierre-François Martin-Laval
2nd rowGene Stupnitsky
3rd rowRob Marshall
4th rowYann Gozlan
5th rowBrad Bird
ValueCountFrequency (%)
david 112
 
1.1%
john 72
 
0.7%
james 62
 
0.6%
eric 60
 
0.6%
michael 56
 
0.5%
robert 55
 
0.5%
steven 52
 
0.5%
philippe 52
 
0.5%
peter 51
 
0.5%
paul 48
 
0.5%
Other values (3368) 9603
93.9%

Most occurring characters

ValueCountFrequency (%)
e 6247
 
9.3%
a 5899
 
8.7%
5349
 
7.9%
r 4608
 
6.8%
n 4548
 
6.7%
i 4527
 
6.7%
o 3866
 
5.7%
l 3142
 
4.7%
t 2298
 
3.4%
s 2177
 
3.2%
Other values (80) 24867
36.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 51199
75.8%
Uppercase Letter 10528
 
15.6%
Space Separator 5349
 
7.9%
Dash Punctuation 210
 
0.3%
Other Punctuation 155
 
0.2%
Open Punctuation 43
 
0.1%
Close Punctuation 43
 
0.1%
Final Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 6247
12.2%
a 5899
11.5%
r 4608
 
9.0%
n 4548
 
8.9%
i 4527
 
8.8%
o 3866
 
7.6%
l 3142
 
6.1%
t 2298
 
4.5%
s 2177
 
4.3%
h 1797
 
3.5%
Other values (40) 12090
23.6%
Uppercase Letter
ValueCountFrequency (%)
M 884
 
8.4%
S 864
 
8.2%
C 756
 
7.2%
J 751
 
7.1%
A 715
 
6.8%
B 681
 
6.5%
D 601
 
5.7%
L 588
 
5.6%
R 575
 
5.5%
G 554
 
5.3%
Other values (23) 3559
33.8%
Other Punctuation
ValueCountFrequency (%)
. 143
92.3%
' 12
 
7.7%
Space Separator
ValueCountFrequency (%)
5349
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 210
100.0%
Open Punctuation
ValueCountFrequency (%)
( 43
100.0%
Close Punctuation
ValueCountFrequency (%)
) 43
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 61727
91.4%
Common 5801
 
8.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 6247
 
10.1%
a 5899
 
9.6%
r 4608
 
7.5%
n 4548
 
7.4%
i 4527
 
7.3%
o 3866
 
6.3%
l 3142
 
5.1%
t 2298
 
3.7%
s 2177
 
3.5%
h 1797
 
2.9%
Other values (73) 22618
36.6%
Common
ValueCountFrequency (%)
5349
92.2%
- 210
 
3.6%
. 143
 
2.5%
( 43
 
0.7%
) 43
 
0.7%
' 12
 
0.2%
1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 66845
99.0%
None 682
 
1.0%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 6247
 
9.3%
a 5899
 
8.8%
5349
 
8.0%
r 4608
 
6.9%
n 4548
 
6.8%
i 4527
 
6.8%
o 3866
 
5.8%
l 3142
 
4.7%
t 2298
 
3.4%
s 2177
 
3.3%
Other values (48) 24184
36.2%
None
ValueCountFrequency (%)
é 355
52.1%
ç 54
 
7.9%
è 53
 
7.8%
á 38
 
5.6%
ô 33
 
4.8%
ó 31
 
4.5%
ë 19
 
2.8%
ï 16
 
2.3%
ö 10
 
1.5%
ø 8
 
1.2%
Other values (21) 65
 
9.5%
Punctuation
ValueCountFrequency (%)
1
100.0%

distributeur
Text

MISSING 

Distinct281
Distinct (%)5.9%
Missing127
Missing (%)2.6%
Memory size38.5 KiB

Length

Max length52
Median length36
Mean length20.793982
Min length5

Characters and Unicode

Total characters99520
Distinct characters76
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique125 ?
Unique (%)2.6%

Sample

1st row UGC Distribution
2nd row Sony Pictures Releasing France
3rd row The Walt Disney Company France
4th row Mars Films
5th row Paramount Pictures France
ValueCountFrequency (%)
france 1269
 
10.4%
distribution 936
 
7.6%
films 718
 
5.9%
pictures 681
 
5.6%
international 406
 
3.3%
metropolitan 321
 
2.6%
filmexport 321
 
2.6%
warner 307
 
2.5%
bros 307
 
2.5%
universal 243
 
2.0%
Other values (306) 6746
55.0%

Most occurring characters

ValueCountFrequency (%)
16806
16.9%
i 7631
 
7.7%
n 7083
 
7.1%
t 7014
 
7.0%
r 6605
 
6.6%
a 6452
 
6.5%
e 6118
 
6.1%
o 5002
 
5.0%
s 3978
 
4.0%
l 3086
 
3.1%
Other values (66) 29745
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 67679
68.0%
Space Separator 16806
 
16.9%
Uppercase Letter 13794
 
13.9%
Other Punctuation 418
 
0.4%
Decimal Number 332
 
0.3%
Control 236
 
0.2%
Close Punctuation 125
 
0.1%
Open Punctuation 125
 
0.1%
Dash Punctuation 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 7631
11.3%
n 7083
10.5%
t 7014
10.4%
r 6605
9.8%
a 6452
9.5%
e 6118
9.0%
o 5002
7.4%
s 3978
 
5.9%
l 3086
 
4.6%
u 3070
 
4.5%
Other values (20) 11640
17.2%
Uppercase Letter
ValueCountFrequency (%)
F 2699
19.6%
P 1514
11.0%
D 1486
10.8%
C 1027
 
7.4%
S 766
 
5.6%
B 705
 
5.1%
M 688
 
5.0%
U 656
 
4.8%
T 616
 
4.5%
W 614
 
4.5%
Other values (16) 3023
21.9%
Decimal Number
ValueCountFrequency (%)
1 93
28.0%
2 93
28.0%
3 33
 
9.9%
4 29
 
8.7%
5 20
 
6.0%
6 16
 
4.8%
8 16
 
4.8%
0 13
 
3.9%
9 10
 
3.0%
7 9
 
2.7%
Other Punctuation
ValueCountFrequency (%)
. 307
73.4%
/ 100
 
23.9%
' 7
 
1.7%
# 3
 
0.7%
! 1
 
0.2%
Space Separator
ValueCountFrequency (%)
16806
100.0%
Control
ValueCountFrequency (%)
236
100.0%
Close Punctuation
ValueCountFrequency (%)
) 125
100.0%
Open Punctuation
ValueCountFrequency (%)
( 125
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 81473
81.9%
Common 18047
 
18.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 7631
 
9.4%
n 7083
 
8.7%
t 7014
 
8.6%
r 6605
 
8.1%
a 6452
 
7.9%
e 6118
 
7.5%
o 5002
 
6.1%
s 3978
 
4.9%
l 3086
 
3.8%
u 3070
 
3.8%
Other values (46) 25434
31.2%
Common
ValueCountFrequency (%)
16806
93.1%
. 307
 
1.7%
236
 
1.3%
) 125
 
0.7%
( 125
 
0.7%
/ 100
 
0.6%
1 93
 
0.5%
2 93
 
0.5%
3 33
 
0.2%
4 29
 
0.2%
Other values (10) 100
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 99092
99.6%
None 428
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
16806
17.0%
i 7631
 
7.7%
n 7083
 
7.1%
t 7014
 
7.1%
r 6605
 
6.7%
a 6452
 
6.5%
e 6118
 
6.2%
o 5002
 
5.0%
s 3978
 
4.0%
l 3086
 
3.1%
Other values (61) 29317
29.6%
None
ValueCountFrequency (%)
é 390
91.1%
ê 27
 
6.3%
è 9
 
2.1%
â 1
 
0.2%
À 1
 
0.2%
Distinct4587
Distinct (%)93.8%
Missing22
Missing (%)0.4%
Memory size38.5 KiB

Length

Max length75
Median length64
Mean length42.612145
Min length8

Characters and Unicode

Total characters208416
Distinct characters108
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4334 ?
Unique (%)88.6%

Sample

1st rowChristian Clavier,Isabelle Nanty,Jennie-Anne Walker
2nd rowJennifer Lawrence,Andrew Barth Feldman,Laura Benanti
3rd rowHalle Bailey,Cerise Calixte,Jonah Hauer-King
4th rowPierre Niney,Ana Girardot,André Marcon
5th rowTom Cruise,Jeremy Renner,Simon Pegg
ValueCountFrequency (%)
de 127
 
0.6%
vincent 64
 
0.3%
daniel 62
 
0.3%
tom 60
 
0.3%
jason 44
 
0.2%
michael 44
 
0.2%
robert 43
 
0.2%
gérard 41
 
0.2%
ben 39
 
0.2%
christian 38
 
0.2%
Other values (13011) 19829
97.2%

Most occurring characters

ValueCountFrequency (%)
a 19220
 
9.2%
e 18663
 
9.0%
15534
 
7.5%
n 13733
 
6.6%
i 13692
 
6.6%
r 12424
 
6.0%
o 10740
 
5.2%
l 10027
 
4.8%
, 9762
 
4.7%
t 6867
 
3.3%
Other values (98) 77754
37.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 151485
72.7%
Uppercase Letter 30800
 
14.8%
Space Separator 15534
 
7.5%
Other Punctuation 10001
 
4.8%
Dash Punctuation 481
 
0.2%
Close Punctuation 53
 
< 0.1%
Open Punctuation 53
 
< 0.1%
Decimal Number 6
 
< 0.1%
Final Punctuation 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 19220
12.7%
e 18663
12.3%
n 13733
9.1%
i 13692
9.0%
r 12424
 
8.2%
o 10740
 
7.1%
l 10027
 
6.6%
t 6867
 
4.5%
s 6811
 
4.5%
h 5098
 
3.4%
Other values (49) 34210
22.6%
Uppercase Letter
ValueCountFrequency (%)
M 2711
 
8.8%
C 2344
 
7.6%
B 2285
 
7.4%
J 2255
 
7.3%
S 2080
 
6.8%
A 1995
 
6.5%
D 1889
 
6.1%
L 1758
 
5.7%
R 1638
 
5.3%
G 1378
 
4.5%
Other values (28) 10467
34.0%
Other Punctuation
ValueCountFrequency (%)
, 9762
97.6%
. 164
 
1.6%
' 74
 
0.7%
/ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 3
50.0%
5 3
50.0%
Space Separator
ValueCountFrequency (%)
15534
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 481
100.0%
Close Punctuation
ValueCountFrequency (%)
) 53
100.0%
Open Punctuation
ValueCountFrequency (%)
( 53
100.0%
Final Punctuation
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 182285
87.5%
Common 26131
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 19220
 
10.5%
e 18663
 
10.2%
n 13733
 
7.5%
i 13692
 
7.5%
r 12424
 
6.8%
o 10740
 
5.9%
l 10027
 
5.5%
t 6867
 
3.8%
s 6811
 
3.7%
h 5098
 
2.8%
Other values (87) 65010
35.7%
Common
ValueCountFrequency (%)
15534
59.4%
, 9762
37.4%
- 481
 
1.8%
. 164
 
0.6%
' 74
 
0.3%
) 53
 
0.2%
( 53
 
0.2%
3
 
< 0.1%
0 3
 
< 0.1%
5 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 206489
99.1%
None 1924
 
0.9%
Punctuation 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 19220
 
9.3%
e 18663
 
9.0%
15534
 
7.5%
n 13733
 
6.7%
i 13692
 
6.6%
r 12424
 
6.0%
o 10740
 
5.2%
l 10027
 
4.9%
, 9762
 
4.7%
t 6867
 
3.3%
Other values (52) 75827
36.7%
None
ValueCountFrequency (%)
é 1008
52.4%
ï 155
 
8.1%
è 144
 
7.5%
ç 123
 
6.4%
ë 72
 
3.7%
î 59
 
3.1%
á 44
 
2.3%
ô 41
 
2.1%
ü 38
 
2.0%
í 27
 
1.4%
Other values (35) 213
 
11.1%
Punctuation
ValueCountFrequency (%)
3
100.0%

titre_original
Text

MISSING 

Distinct1853
Distinct (%)92.7%
Missing2915
Missing (%)59.3%
Memory size38.5 KiB

Length

Max length83
Median length52
Mean length17.63964
Min length1

Characters and Unicode

Total characters35244
Distinct characters113
Distinct categories10 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1733 ?
Unique (%)86.7%

Sample

1st rowNo Hard Feelings
2nd rowThe Little Mermaid
3rd rowMission: Impossible - Ghost Protocol
4th rowFast X
5th rowA Beautiful Day in the Neighborhood
ValueCountFrequency (%)
the 811
 
12.6%
of 252
 
3.9%
a 79
 
1.2%
78
 
1.2%
and 76
 
1.2%
in 57
 
0.9%
no 55
 
0.9%
to 48
 
0.7%
2 46
 
0.7%
la 26
 
0.4%
Other values (2862) 4896
76.2%

Most occurring characters

ValueCountFrequency (%)
4426
 
12.6%
e 3575
 
10.1%
a 2342
 
6.6%
o 2044
 
5.8%
n 1974
 
5.6%
i 1840
 
5.2%
r 1789
 
5.1%
t 1669
 
4.7%
h 1464
 
4.2%
s 1346
 
3.8%
Other values (103) 12775
36.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 24731
70.2%
Uppercase Letter 5319
 
15.1%
Space Separator 4426
 
12.6%
Other Punctuation 445
 
1.3%
Decimal Number 204
 
0.6%
Dash Punctuation 102
 
0.3%
Final Punctuation 6
 
< 0.1%
Close Punctuation 5
 
< 0.1%
Open Punctuation 5
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3575
14.5%
a 2342
9.5%
o 2044
 
8.3%
n 1974
 
8.0%
i 1840
 
7.4%
r 1789
 
7.2%
t 1669
 
6.7%
h 1464
 
5.9%
s 1346
 
5.4%
l 1031
 
4.2%
Other values (47) 5657
22.9%
Uppercase Letter
ValueCountFrequency (%)
T 839
15.8%
S 417
 
7.8%
M 336
 
6.3%
D 296
 
5.6%
A 290
 
5.5%
C 272
 
5.1%
P 256
 
4.8%
L 256
 
4.8%
B 253
 
4.8%
H 252
 
4.7%
Other values (19) 1852
34.8%
Other Punctuation
ValueCountFrequency (%)
: 212
47.6%
' 100
22.5%
. 64
 
14.4%
, 30
 
6.7%
& 19
 
4.3%
! 10
 
2.2%
? 6
 
1.3%
2
 
0.4%
# 1
 
0.2%
/ 1
 
0.2%
Decimal Number
ValueCountFrequency (%)
2 62
30.4%
3 38
18.6%
1 30
14.7%
0 30
14.7%
4 10
 
4.9%
5 9
 
4.4%
9 8
 
3.9%
8 7
 
3.4%
7 6
 
2.9%
6 4
 
2.0%
Dash Punctuation
ValueCountFrequency (%)
- 100
98.0%
2
 
2.0%
Space Separator
ValueCountFrequency (%)
4426
100.0%
Final Punctuation
ValueCountFrequency (%)
6
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30050
85.3%
Common 5194
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3575
 
11.9%
a 2342
 
7.8%
o 2044
 
6.8%
n 1974
 
6.6%
i 1840
 
6.1%
r 1789
 
6.0%
t 1669
 
5.6%
h 1464
 
4.9%
s 1346
 
4.5%
l 1031
 
3.4%
Other values (76) 10976
36.5%
Common
ValueCountFrequency (%)
4426
85.2%
: 212
 
4.1%
' 100
 
1.9%
- 100
 
1.9%
. 64
 
1.2%
2 62
 
1.2%
3 38
 
0.7%
, 30
 
0.6%
1 30
 
0.6%
0 30
 
0.6%
Other values (17) 102
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35153
99.7%
None 81
 
0.2%
Punctuation 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4426
 
12.6%
e 3575
 
10.2%
a 2342
 
6.7%
o 2044
 
5.8%
n 1974
 
5.6%
i 1840
 
5.2%
r 1789
 
5.1%
t 1669
 
4.7%
h 1464
 
4.2%
s 1346
 
3.8%
Other values (66) 12684
36.1%
None
ValueCountFrequency (%)
í 7
 
8.6%
é 7
 
8.6%
ô 6
 
7.4%
ä 5
 
6.2%
ú 4
 
4.9%
ö 4
 
4.9%
á 4
 
4.9%
ó 4
 
4.9%
ø 3
 
3.7%
æ 3
 
3.7%
Other values (24) 34
42.0%
Punctuation
ValueCountFrequency (%)
6
60.0%
2
 
20.0%
2
 
20.0%

nationalités
Text

MISSING 

Distinct552
Distinct (%)40.6%
Missing3552
Missing (%)72.3%
Memory size38.5 KiB

Length

Max length86
Median length59
Mean length21.650992
Min length11

Characters and Unicode

Total characters29467
Distinct characters58
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique422 ?
Unique (%)31.0%

Sample

1st rowFrance,Belgique
2nd rowRépublique tchèque,Italie,Slovaquie
3rd rowU.S.A.,Grande-Bretagne
4th rowU.S.A.,Grande-Bretagne
5th rowSuède,Allemagne,France,Danemark
ValueCountFrequency (%)
france,belgique 137
 
9.3%
u.s.a.,grande-bretagne 97
 
6.6%
u.s.a.,allemagne 47
 
3.2%
grande-bretagne,u.s.a 44
 
3.0%
u.s.a.,canada 42
 
2.9%
u.s.a.,australie 27
 
1.8%
france,allemagne 25
 
1.7%
france,canada 24
 
1.6%
du 21
 
1.4%
italie,france 21
 
1.4%
Other values (563) 981
66.9%

Most occurring characters

ValueCountFrequency (%)
e 3962
 
13.4%
a 2863
 
9.7%
n 2400
 
8.1%
, 2109
 
7.2%
r 1896
 
6.4%
. 1815
 
6.2%
l 1259
 
4.3%
g 1217
 
4.1%
A 1014
 
3.4%
i 866
 
2.9%
Other values (48) 10066
34.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19845
67.3%
Uppercase Letter 5158
 
17.5%
Other Punctuation 3927
 
13.3%
Dash Punctuation 432
 
1.5%
Space Separator 105
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3962
20.0%
a 2863
14.4%
n 2400
12.1%
r 1896
9.6%
l 1259
 
6.3%
g 1217
 
6.1%
i 866
 
4.4%
u 841
 
4.2%
c 815
 
4.1%
d 699
 
3.5%
Other values (20) 3027
15.3%
Uppercase Letter
ValueCountFrequency (%)
A 1014
19.7%
F 744
14.4%
S 728
14.1%
B 710
13.8%
U 621
12.0%
G 382
 
7.4%
C 204
 
4.0%
I 197
 
3.8%
E 93
 
1.8%
L 67
 
1.3%
Other values (13) 398
 
7.7%
Other Punctuation
ValueCountFrequency (%)
, 2109
53.7%
. 1815
46.2%
' 3
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 432
100.0%
Space Separator
ValueCountFrequency (%)
105
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 25003
84.9%
Common 4464
 
15.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3962
15.8%
a 2863
 
11.5%
n 2400
 
9.6%
r 1896
 
7.6%
l 1259
 
5.0%
g 1217
 
4.9%
A 1014
 
4.1%
i 866
 
3.5%
u 841
 
3.4%
c 815
 
3.3%
Other values (43) 7870
31.5%
Common
ValueCountFrequency (%)
, 2109
47.2%
. 1815
40.7%
- 432
 
9.7%
105
 
2.4%
' 3
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29253
99.3%
None 214
 
0.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3962
13.5%
a 2863
 
9.8%
n 2400
 
8.2%
, 2109
 
7.2%
r 1896
 
6.5%
. 1815
 
6.2%
l 1259
 
4.3%
g 1217
 
4.2%
A 1014
 
3.5%
i 866
 
3.0%
Other values (43) 9852
33.7%
None
ValueCountFrequency (%)
é 103
48.1%
è 94
43.9%
ï 9
 
4.2%
ë 7
 
3.3%
ô 1
 
0.5%
Distinct374
Distinct (%)7.6%
Missing0
Missing (%)0.0%
Memory size38.5 KiB

Length

Max length87
Median length78
Mean length9.6262976
Min length1

Characters and Unicode

Total characters47294
Distinct characters53
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique253 ?
Unique (%)5.1%

Sample

1st rowFrançais
2nd rowAnglais
3rd rowAnglais
4th rowFrançais
5th rowAnglais
ValueCountFrequency (%)
anglais 2920
47.5%
français 1798
29.3%
espagnol 218
 
3.5%
allemand 166
 
2.7%
japonais 156
 
2.5%
italien 115
 
1.9%
arabe 101
 
1.6%
russe 87
 
1.4%
coréen 46
 
0.7%
mandarin 45
 
0.7%
Other values (67) 493
 
8.0%

Most occurring characters

ValueCountFrequency (%)
a 7840
16.6%
n 5812
12.3%
s 5506
11.6%
i 5431
11.5%
l 3681
7.8%
g 3218
6.8%
A 3208
6.8%
r 2197
 
4.6%
F 1823
 
3.9%
ç 1798
 
3.8%
Other values (43) 6780
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 38713
81.9%
Uppercase Letter 6124
 
12.9%
Space Separator 1232
 
2.6%
Other Punctuation 1217
 
2.6%
Dash Punctuation 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7840
20.3%
n 5812
15.0%
s 5506
14.2%
i 5431
14.0%
l 3681
9.5%
g 3218
8.3%
r 2197
 
5.7%
ç 1798
 
4.6%
e 738
 
1.9%
o 664
 
1.7%
Other values (17) 1828
 
4.7%
Uppercase Letter
ValueCountFrequency (%)
A 3208
52.4%
F 1823
29.8%
E 220
 
3.6%
J 156
 
2.5%
I 126
 
2.1%
C 104
 
1.7%
R 104
 
1.7%
P 58
 
0.9%
M 56
 
0.9%
H 52
 
0.8%
Other values (13) 217
 
3.5%
Space Separator
ValueCountFrequency (%)
1232
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1217
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 44837
94.8%
Common 2457
 
5.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 7840
17.5%
n 5812
13.0%
s 5506
12.3%
i 5431
12.1%
l 3681
8.2%
g 3218
7.2%
A 3208
7.2%
r 2197
 
4.9%
F 1823
 
4.1%
ç 1798
 
4.0%
Other values (40) 4323
9.6%
Common
ValueCountFrequency (%)
1232
50.1%
, 1217
49.5%
- 8
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45365
95.9%
None 1929
 
4.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 7840
17.3%
n 5812
12.8%
s 5506
12.1%
i 5431
12.0%
l 3681
8.1%
g 3218
7.1%
A 3208
7.1%
r 2197
 
4.8%
F 1823
 
4.0%
1232
 
2.7%
Other values (39) 5417
11.9%
None
ValueCountFrequency (%)
ç 1798
93.2%
é 105
 
5.4%
ï 14
 
0.7%
è 12
 
0.6%

type_film
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size38.5 KiB
Long-métrage: 4910
Télefilm: 2
Film à sketches: 1

Length

Max length15
Median length12
Mean length11.998982
Min length8

Characters and Unicode

Total characters58951
Distinct characters22
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowLong-métrage
2nd rowLong-métrage
3rd rowLong-métrage
4th rowLong-métrage
5th rowLong-métrage

Common Values

ValueCountFrequency (%)
Long-métrage 4910
99.9%
Télefilm 2
 
< 0.1%
Film à sketches 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
long-métrage 4910
99.9%
télefilm 2
 
< 0.1%
film 1
 
< 0.1%
à 1
 
< 0.1%
sketches 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
g 9820
16.7%
e 4914
8.3%
m 4913
8.3%
é 4912
8.3%
t 4911
8.3%
L 4910
8.3%
n 4910
8.3%
- 4910
8.3%
r 4910
8.3%
a 4910
8.3%
Other values (12) 4931
8.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 49126
83.3%
Uppercase Letter 4913
 
8.3%
Dash Punctuation 4910
 
8.3%
Space Separator 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
g 9820
20.0%
e 4914
10.0%
m 4913
10.0%
é 4912
10.0%
t 4911
10.0%
n 4910
10.0%
r 4910
10.0%
a 4910
10.0%
o 4910
10.0%
l 5
 
< 0.1%
Other values (7) 11
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
L 4910
99.9%
T 2
 
< 0.1%
F 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 4910
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 54039
91.7%
Common 4912
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
g 9820
18.2%
e 4914
9.1%
m 4913
9.1%
é 4912
9.1%
t 4911
9.1%
L 4910
9.1%
n 4910
9.1%
r 4910
9.1%
a 4910
9.1%
o 4910
9.1%
Other values (10) 19
 
< 0.1%
Common
ValueCountFrequency (%)
- 4910
> 99.9%
2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54038
91.7%
None 4913
 
8.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
g 9820
18.2%
e 4914
9.1%
m 4913
9.1%
t 4911
9.1%
L 4910
9.1%
n 4910
9.1%
- 4910
9.1%
r 4910
9.1%
a 4910
9.1%
o 4910
9.1%
Other values (10) 20
 
< 0.1%
None
ValueCountFrequency (%)
é 4912
> 99.9%
à 1
 
< 0.1%

annee_production
Real number (ℝ)

Distinct72
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2012.3639
Minimum1933
Maximum2023
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size38.5 KiB

Quantile statistics

Minimum1933
5-th percentile2001
Q12008
median2014
Q32018
95-th percentile2022
Maximum2023
Range90
Interquartile range (IQR)10

Descriptive statistics

Standard deviation8.7383856
Coefficient of variation (CV)0.0043423485
Kurtosis13.550458
Mean2012.3639
Median Absolute Deviation (MAD)5
Skewness-2.5486845
Sum9886744
Variance76.359383
MonotonicityNot monotonic
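
Several of the figures above are derived from the others. As a minimal sketch (the variable names are illustrative and not part of the report), the following Python snippet re-derives the range, IQR, variance, coefficient of variation and mean from the values reported for annee_production:

```python
import math

# Values copied from the annee_production statistics above.
n_obs = 4913                  # no missing values for this variable
total = 9886744               # reported Sum
minimum, maximum = 1933, 2023
q1, q3 = 2008, 2018
std = 8.7383856
mean = 2012.3639

value_range = maximum - minimum   # reported Range: 90
iqr = q3 - q1                     # reported IQR: 10
variance = std ** 2               # reported Variance: 76.359383
cv = std / mean                   # reported CV: 0.0043423485
mean_from_sum = total / n_obs     # reported Mean: 2012.3639

print(value_range, iqr, variance, cv, mean_from_sum)
```

Small differences in the last digits are expected, because the report rounds each figure before displaying it.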
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2022 329
 
6.7%
2018 300
 
6.1%
2019 282
 
5.7%
2021 270
 
5.5%
2017 269
 
5.5%
2016 263
 
5.4%
2013 243
 
4.9%
2015 235
 
4.8%
2014 235
 
4.8%
2011 216
 
4.4%
Other values (62) 2271
46.2%
ValueCountFrequency (%)
1933 1
 
< 0.1%
1939 1
 
< 0.1%
1940 1
 
< 0.1%
1941 1
 
< 0.1%
1946 2
< 0.1%
1947 1
 
< 0.1%
1950 3
0.1%
1951 2
< 0.1%
1952 1
 
< 0.1%
1953 2
< 0.1%
ValueCountFrequency (%)
2023 130
 
2.6%
2022 329
6.7%
2021 270
5.5%
2020 214
4.4%
2019 282
5.7%
2018 300
6.1%
2017 269
5.5%
2016 263
5.4%
2015 235
4.8%
2014 235
4.8%

budget
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct439
Distinct (%)22.1%
Missing2930
Missing (%)59.6%
Infinite0
Infinite (%)0.0%
Mean69388977
Minimum0.1
Maximum2.704 × 10^9
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size38.5 KiB
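
The two percentages above appear to use different denominators: Missing (%) is taken over all observations, while Distinct (%) matches the count of non-missing values. This is a reading of the reported numbers, not a statement about the profiler's internals. A short Python check, using only the counts shown for budget:

```python
# Counts copied from the budget summary above.
n_obs = 4913        # observations in the dataset
n_missing = 2930    # missing budget values
n_distinct = 439    # distinct budget values

n_present = n_obs - n_missing                 # rows with a budget value
missing_pct = 100 * n_missing / n_obs         # share of all rows
distinct_pct = 100 * n_distinct / n_present   # share of non-missing rows

print(f"present={n_present}, missing={missing_pct:.1f}%, distinct={distinct_pct:.1f}%")
# prints: present=1983, missing=59.6%, distinct=22.1%
```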

Quantile statistics

Minimum0.1
5-th percentile37.1
Q110000000
median30000000
Q370000000
95-th percentile2 × 10^8
Maximum2.704 × 10^9
Range2.704 × 10^9
Interquartile range (IQR)60000000

Descriptive statistics

Standard deviation1.7411925 × 10^8
Coefficient of variation (CV)2.5093215
Kurtosis91.887769
Mean69388977
Median Absolute Deviation (MAD)24680000
Skewness8.6115065
Sum1.3759834 × 10^11
Variance3.0317515 × 10^16
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
30000000 73
 
1.5%
20000000 67
 
1.4%
40000000 57
 
1.2%
25000000 55
 
1.1%
100000000 53
 
1.1%
60000000 52
 
1.1%
50000000 51
 
1.0%
35000000 51
 
1.0%
10000000 51
 
1.0%
15000000 49
 
1.0%
Other values (429) 1424
29.0%
(Missing) 2930
59.6%
ValueCountFrequency (%)
0.1 3
0.1%
0.12 1
 
< 0.1%
0.15 1
 
< 0.1%
0.2 1
 
< 0.1%
0.248 1
 
< 0.1%
0.35 1
 
< 0.1%
0.7 1
 
< 0.1%
0.75 1
 
< 0.1%
0.92 1
 
< 0.1%
1 3
0.1%
ValueCountFrequency (%)
2704000000 1
< 0.1%
2384000000 1
< 0.1%
2320000000 1
< 0.1%
1900000000 2
< 0.1%
1781000000 1
< 0.1%
1674000000 1
< 0.1%
1575000000 1
< 0.1%
1570000000 1
< 0.1%
1502000000 1
< 0.1%
1417000000 1
< 0.1%

box_office_total
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct4259
Distinct (%)97.8%
Missing556
Missing (%)11.3%
Infinite0
Infinite (%)0.0%
Mean710044.91
Minimum0
Maximum20328052
Zeros4
Zeros (%)0.1%
Negative0
Negative (%)0.0%
Memory size38.5 KiB

Quantile statistics

Minimum0
5-th percentile8696.6
Q189808
median282658
Q3774897
95-th percentile3008415
Maximum20328052
Range20328052
Interquartile range (IQR)685089

Descriptive statistics

Standard deviation1251938.8
Coefficient of variation (CV)1.7631825
Kurtosis44.929945
Mean710044.91
Median Absolute Deviation (MAD)238315
Skewness5.0978459
Sum3.0936657 × 10^9
Variance1.5673506 × 10^12
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 4
 
0.1%
389232 3
 
0.1%
63859 3
 
0.1%
20596 3
 
0.1%
13601 2
 
< 0.1%
534326 2
 
< 0.1%
4503694 2
 
< 0.1%
52099 2
 
< 0.1%
10841 2
 
< 0.1%
220398 2
 
< 0.1%
Other values (4249) 4332
88.2%
(Missing) 556
 
11.3%
ValueCountFrequency (%)
0 4
0.1%
6 1
 
< 0.1%
13 1
 
< 0.1%
20 1
 
< 0.1%
30 1
 
< 0.1%
32 1
 
< 0.1%
35 1
 
< 0.1%
39 1
 
< 0.1%
40 1
 
< 0.1%
43 1
 
< 0.1%
ValueCountFrequency (%)
20328052 1
< 0.1%
19273540 1
< 0.1%
14637879 1
< 0.1%
14194819 1
< 0.1%
14000193 1
< 0.1%
12223535 1
< 0.1%
10401347 1
< 0.1%
10344520 1
< 0.1%
9929337 1
< 0.1%
9751952 2
< 0.1%

note_presse
Real number (ℝ)

HIGH CORRELATION 

Distinct41
Distinct (%)0.8%
Missing3
Missing (%)0.1%
Infinite0
Infinite (%)0.0%
Mean3.1601018
Minimum1
Maximum5
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size38.5 KiB

Quantile statistics

Minimum1
5-th percentile2
Q12.7
median3.2
Q33.7
95-th percentile4.2
Maximum5
Range4
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.67756498
Coefficient of variation (CV)0.21441239
Kurtosis-0.17251566
Mean3.1601018
Median Absolute Deviation (MAD)0.5
Skewness-0.27730464
Sum15516.1
Variance0.45909431
MonotonicityNot monotonic
Histogram with fixed size bins (bins=41)
ValueCountFrequency (%)
3.3 293
 
6.0%
3.4 292
 
5.9%
3.7 276
 
5.6%
3.2 274
 
5.6%
3.1 274
 
5.6%
3 269
 
5.5%
3.5 261
 
5.3%
3.6 260
 
5.3%
2.9 245
 
5.0%
3.8 232
 
4.7%
Other values (31) 2234
45.5%
ValueCountFrequency (%)
1 7
 
0.1%
1.1 2
 
< 0.1%
1.2 6
 
0.1%
1.3 7
 
0.1%
1.4 12
 
0.2%
1.5 30
0.6%
1.6 24
0.5%
1.7 31
0.6%
1.8 56
1.1%
1.9 49
1.0%
ValueCountFrequency (%)
5 5
 
0.1%
4.9 8
 
0.2%
4.8 10
 
0.2%
4.7 16
 
0.3%
4.6 20
 
0.4%
4.5 25
 
0.5%
4.4 43
 
0.9%
4.3 63
1.3%
4.2 80
1.6%
4.1 142
2.9%

note_spectateurs
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct38
Distinct (%)0.8%
Missing330
Missing (%)6.7%
Infinite0
Infinite (%)0.0%
Mean3.1559459
Minimum0.8
Maximum4.5
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size38.5 KiB

Quantile statistics

Minimum0.8
5-th percentile1.9
Q12.6
median3.3
Q33.7
95-th percentile4.2
Maximum4.5
Range3.7
Interquartile range (IQR)1.1

Descriptive statistics

Standard deviation0.7301536
Coefficient of variation (CV)0.23135809
Kurtosis-0.5253124
Mean3.1559459
Median Absolute Deviation (MAD)0.5
Skewness-0.41433067
Sum14463.7
Variance0.53312429
MonotonicityNot monotonic
Histogram with fixed size bins (bins=38)
ValueCountFrequency (%)
3.5 260
 
5.3%
3.7 248
 
5.0%
3.6 245
 
5.0%
3.8 234
 
4.8%
3.3 229
 
4.7%
3.4 221
 
4.5%
3.9 218
 
4.4%
3.1 201
 
4.1%
3.2 201
 
4.1%
2.7 193
 
3.9%
Other values (28) 2333
47.5%
(Missing) 330
 
6.7%
ValueCountFrequency (%)
0.8 1
 
< 0.1%
0.9 4
 
0.1%
1 4
 
0.1%
1.1 4
 
0.1%
1.2 7
 
0.1%
1.3 12
 
0.2%
1.4 26
0.5%
1.5 28
0.6%
1.6 36
0.7%
1.7 44
0.9%
ValueCountFrequency (%)
4.5 31
 
0.6%
4.4 47
 
1.0%
4.3 107
2.2%
4.2 139
2.8%
4.1 132
2.7%
4 189
3.8%
3.9 218
4.4%
3.8 234
4.8%
3.7 248
5.0%
3.6 245
5.0%

nombre_article
Real number (ℝ)

Distinct302
Distinct (%)6.2%
Missing15
Missing (%)0.3%
Infinite0
Infinite (%)0.0%
Mean49.761944
Minimum1
Maximum2463
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size38.5 KiB

Quantile statistics

Minimum1
5-th percentile6
Q115
median21
Q327
95-th percentile157.15
Maximum2463
Range2462
Interquartile range (IQR)12

Descriptive statistics

Standard deviation159.73996
Coefficient of variation (CV)3.2100827
Kurtosis94.328852
Mean49.761944
Median Absolute Deviation (MAD)6
Skewness8.6430899
Sum243734
Variance25516.853
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20 218
 
4.4%
18 218
 
4.4%
23 216
 
4.4%
19 212
 
4.3%
21 207
 
4.2%
17 205
 
4.2%
24 199
 
4.1%
22 197
 
4.0%
25 193
 
3.9%
16 169
 
3.4%
Other values (292) 2864
58.3%
ValueCountFrequency (%)
1 15
 
0.3%
2 24
 
0.5%
3 39
0.8%
4 62
1.3%
5 77
1.6%
6 81
1.6%
7 84
1.7%
8 74
1.5%
9 92
1.9%
10 95
1.9%
ValueCountFrequency (%)
2463 2
< 0.1%
2448 2
< 0.1%
2423 2
< 0.1%
2042 1
 
< 0.1%
1877 2
< 0.1%
1427 1
 
< 0.1%
1405 1
 
< 0.1%
1363 4
0.1%
1356 2
< 0.1%
1326 1
 
< 0.1%

recompenses
Text

MISSING 

Distinct270
Distinct (%)11.2%
Missing2495
Missing (%)50.8%
Memory size38.5 KiB

Length

Max length25
Median length24
Mean length18.727874
Min length6

Characters and Unicode

Total characters45284
Distinct characters23
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
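
As an illustration of these character properties, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical. The short codes it returns correspond to the long names used in this report (`Ll` = Lowercase Letter, `Nd` = Decimal Number, `Zs` = Space Separator).

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Ll', 'Nd', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)

# One of the sample values of this variable.
counts = category_counts("11 prix et 33 nominations")
print(counts)
```

For this 25-character value the counts are 17 lowercase letters, 4 decimal digits and 4 spaces.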

Unique

Unique99
Unique (%)4.1%

Sample

1st row3 nominations
2nd row5 nominations
3rd row9 nominations
4th row5 nominations
5th row11 prix et 33 nominations
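
Because this variable stores counts as free text, numeric analysis needs a parsing step first. The following is a minimal sketch, assuming the values follow the patterns visible in the sample rows ("N nominations" or "N prix et M nominations"). The function name and the regular expressions are hypothetical, and values in other formats would simply yield `None`.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns, inferred only from the sample rows shown above.
_PRIX = re.compile(r"(\d+)\s+prix")
_NOMINATIONS = re.compile(r"(\d+)\s+nominations?")

def parse_recompenses(text: Optional[str]) -> Tuple[Optional[int], Optional[int]]:
    """Return (prix, nominations); a part that is absent is returned as None."""
    if not text:
        return (None, None)
    prix = _PRIX.search(text)
    nominations = _NOMINATIONS.search(text)
    return (
        int(prix.group(1)) if prix else None,
        int(nominations.group(1)) if nominations else None,
    )

print(parse_recompenses("11 prix et 33 nominations"))  # (11, 33)
print(parse_recompenses("3 nominations"))              # (None, 3)
```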
ValueCountFrequency (%)
nominations 2312
26.1%
prix 1360
15.4%
et 1338
15.1%
2 667
 
7.5%
1 621
 
7.0%
3 475
 
5.4%
4 333
 
3.8%
5 273
 
3.1%
7 206
 
2.3%
6 198
 
2.2%
Other values (35) 1067
12.1%

Most occurring characters

ValueCountFrequency (%)
n 7188
15.9%
(space) 6432
14.2%
i 6152
13.6%
o 4792
10.6%
t 3734
8.2%
m 2396
 
5.3%
a 2396
 
5.3%
s 2312
 
5.1%
x 1360
 
3.0%
p 1360
 
3.0%
Other values (13) 7162
15.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 34388
75.9%
Space Separator 6432
 
14.2%
Decimal Number 4442
 
9.8%
Other Punctuation 22
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 7188
20.9%
i 6152
17.9%
o 4792
13.9%
t 3734
10.9%
m 2396
 
7.0%
a 2396
 
7.0%
s 2312
 
6.7%
x 1360
 
4.0%
p 1360
 
4.0%
r 1360
 
4.0%
Decimal Number
ValueCountFrequency (%)
1 1289
29.0%
2 886
19.9%
3 583
13.1%
4 399
 
9.0%
5 323
 
7.3%
6 262
 
5.9%
7 244
 
5.5%
8 154
 
3.5%
9 154
 
3.5%
0 148
 
3.3%
Space Separator
ValueCountFrequency (%)
(space) 6432
100.0%
Other Punctuation
ValueCountFrequency (%)
# 22
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 34388
75.9%
Common 10896
 
24.1%

Most frequent character per script

Common
ValueCountFrequency (%)
(space) 6432
59.0%
1 1289
 
11.8%
2 886
 
8.1%
3 583
 
5.4%
4 399
 
3.7%
5 323
 
3.0%
6 262
 
2.4%
7 244
 
2.2%
8 154
 
1.4%
9 154
 
1.4%
Other values (2) 170
 
1.6%
Latin
ValueCountFrequency (%)
n 7188
20.9%
i 6152
17.9%
o 4792
13.9%
t 3734
10.9%
m 2396
 
7.0%
a 2396
 
7.0%
s 2312
 
6.7%
x 1360
 
4.0%
p 1360
 
4.0%
r 1360
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45284
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 7188
15.9%
(space) 6432
14.2%
i 6152
13.6%
o 4792
10.6%
t 3734
8.2%
m 2396
 
5.3%
a 2396
 
5.3%
s 2312
 
5.1%
x 1360
 
3.0%
p 1360
 
3.0%
Other values (13) 7162
15.8%

description
Text

MISSING 

Distinct4185
Distinct (%)95.2%
Missing518
Missing (%)10.5%
Memory size38.5 KiB

Length

Max length1855
Median length687
Mean length378.20796
Min length1

Characters and Unicode

Total characters1662224
Distinct characters143
Distinct categories17
Distinct scripts3
Distinct blocks7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
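
The Nonspacing Mark characters reported for this variable (combining accents) suggest that some descriptions store accented letters in decomposed form, where a base letter is followed by a separate combining mark. The sketch below, using Python's standard `unicodedata` module, shows how one accented letter is categorised in each form and how NFC normalisation recomposes it. Whether normalising is appropriate for this dataset is left as an assumption.

```python
import unicodedata

composed = "\u00e9"                                   # 'é' as a single code point
decomposed = unicodedata.normalize("NFD", composed)   # 'e' followed by U+0301

composed_categories = [unicodedata.category(ch) for ch in composed]
decomposed_categories = [unicodedata.category(ch) for ch in decomposed]
recomposed = unicodedata.normalize("NFC", decomposed)

print(composed_categories)     # ['Ll']
print(decomposed_categories)   # ['Ll', 'Mn']
print(recomposed == composed)  # True
```

Here `Mn` is the short code for Nonspacing Mark.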

Unique

Unique3990
Unique (%)90.8%

Sample

1st rowInstituteur à la retraite, Robert Poutifard n'a plus qu'une idée en tête : se venger de ses anciens élèves qui ont gâché sa vie ! Pour l’aider à mettre en place son plan diabolique, il a la meilleure des complices à ses côtés... sa maman. Ensemble, ils vont leur en faire voir de toutes les couleurs ! La vengeance est un plat qui se mange froid, et Robert Poutifard leur prépare une vraie surprise du chef.
2nd rowMaddie est sur le point de perdre sa maison d’enfance et elle pense avoir trouvé la solution à ses problèmes financiers lorsqu’elle tombe sur une offre d’emploi intrigante : parents fortunés cherchent quelqu’un pour emmener Percy, leur fils introverti de 19 ans, dans une série de « dates » afin de le décoincer avant qu’il ne parte pour l’université. A la grande surprise de Maddie, Percy rend ce challenge plus compliqué que prévu et le temps est compté. Elle a un été pour relever ce challenge ou se retrouver sans toit.
3rd rowLes années 1830, dans les eaux d'une île fictive des Caraïbes. Ariel, la benjamine des filles du roi Triton, est une jeune sirène belle et fougueuse dotée d’un tempérament d’aventurière. Rebelle dans l’âme, elle n’a de cesse d’être attirée par le monde qui existe par-delà les flots. Au détour de ses escapades à la surface, elle va tomber sous le charme du prince Eric. Alors qu'il est interdit aux sirènes d'interagir avec les humains, Ariel sent pourtant qu’elle doit suivre son cœur. Elle conclut alors un accord avec Ursula, la terrible sorcière des mers, qui lui octroie le pouvoir de vivre sur la terre ferme, mais sans se douter que ce pacte met sa vie - et la couronne de son père - en danger...
4th rowMathieu, 25 ans, aspire depuis toujours à devenir un auteur reconnu. Un rêve qui lui semble inaccessible car malgré tous ses efforts, il n’a jamais réussi à être édité. En attendant, il gagne sa vie en travaillant chez son oncle qui dirige une société de déménagement…
5th rowImpliquée dans l'attentat terroriste du Kremlin, l'agence Mission Impossible (IMF) est totalement discréditée. Tandis que le président lance l'opération "Protocole Fantôme", Ethan Hunt, privé de ressources et de renfort, doit trouver le moyen de blanchir l'agence et de déjouer toute nouvelle tentative d'attentat. Mais pour compliquer encore la situation, l'agent doit s'engager dans cette mission avec une équipe de fugitifs d'IMF dont il n'a pas bien cerné les motivations…
ValueCountFrequency (%)
de 13946
 
5.1%
la 7473
 
2.7%
et 7203
 
2.6%
à 5823
 
2.1%
le 5728
 
2.1%
un 4771
 
1.7%
les 4235
 
1.5%
une 3723
 
1.4%
son 3312
 
1.2%
dans 3103
 
1.1%
Other values (28226) 216453
78.5%

Most occurring characters

ValueCountFrequency (%)
(space) 270652
16.3%
e 199780
12.0%
s 100676
 
6.1%
a 98695
 
5.9%
n 98371
 
5.9%
r 97557
 
5.9%
i 88154
 
5.3%
t 83787
 
5.0%
u 79971
 
4.8%
l 72943
 
4.4%
Other values (133) 471638
28.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1292790
77.8%
Space Separator 271417
 
16.3%
Other Punctuation 46739
 
2.8%
Uppercase Letter 35956
 
2.2%
Final Punctuation 7828
 
0.5%
Decimal Number 4396
 
0.3%
Dash Punctuation 2476
 
0.1%
Nonspacing Mark 227
 
< 0.1%
Initial Punctuation 171
 
< 0.1%
Open Punctuation 101
 
< 0.1%
Other values (7) 123
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 199780
15.5%
s 100676
 
7.8%
a 98695
 
7.6%
n 98371
 
7.6%
r 97557
 
7.5%
i 88154
 
6.8%
t 83787
 
6.5%
u 79971
 
6.2%
l 72943
 
5.6%
o 67399
 
5.2%
Other values (44) 305457
23.6%
Uppercase Letter
ValueCountFrequency (%)
A 3271
 
9.1%
L 3251
 
9.0%
M 3215
 
8.9%
C 2669
 
7.4%
S 2581
 
7.2%
P 2210
 
6.1%
E 1977
 
5.5%
D 1918
 
5.3%
I 1581
 
4.4%
B 1545
 
4.3%
Other values (24) 11738
32.6%
Other Punctuation
ValueCountFrequency (%)
, 20184
43.2%
. 15416
33.0%
' 7580
 
16.2%
: 1053
 
2.3%
911
 
1.9%
" 689
 
1.5%
! 401
 
0.9%
? 385
 
0.8%
; 88
 
0.2%
/ 14
 
< 0.1%
Other values (4) 18
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1030
23.4%
0 732
16.7%
9 587
13.4%
2 420
9.6%
8 305
 
6.9%
3 296
 
6.7%
5 276
 
6.3%
4 272
 
6.2%
7 261
 
5.9%
6 217
 
4.9%
Nonspacing Mark
ValueCountFrequency (%)
́ 129
56.8%
̀ 71
31.3%
̂ 19
 
8.4%
̈ 4
 
1.8%
̧ 4
 
1.8%
Space Separator
ValueCountFrequency (%)
(space) 270652
99.7%
  762
 
0.3%
2
 
< 0.1%
1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 2355
95.1%
118
 
4.8%
2
 
0.1%
1
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
7661
97.9%
» 159
 
2.0%
8
 
0.1%
Initial Punctuation
ValueCountFrequency (%)
« 159
93.0%
8
 
4.7%
4
 
2.3%
Other Symbol
ValueCountFrequency (%)
° 8
80.0%
® 1
 
10.0%
1
 
10.0%
Modifier Letter
ValueCountFrequency (%)
ʼ 2
50.0%
1
25.0%
ʳ 1
25.0%
Open Punctuation
ValueCountFrequency (%)
( 101
100.0%
Close Punctuation
ValueCountFrequency (%)
) 100
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 3
100.0%
Format
ValueCountFrequency (%)
2
100.0%
Math Symbol
ValueCountFrequency (%)
+ 2
100.0%
Other Number
ValueCountFrequency (%)
¹ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1328748
79.9%
Common 333249
 
20.0%
Inherited 227
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 199780
15.0%
s 100676
 
7.6%
a 98695
 
7.4%
n 98371
 
7.4%
r 97557
 
7.3%
i 88154
 
6.6%
t 83787
 
6.3%
u 79971
 
6.0%
l 72943
 
5.5%
o 67399
 
5.1%
Other values (80) 341415
25.7%
Common
ValueCountFrequency (%)
(space) 270652
81.2%
, 20184
 
6.1%
. 15416
 
4.6%
7661
 
2.3%
' 7580
 
2.3%
- 2355
 
0.7%
: 1053
 
0.3%
1 1030
 
0.3%
911
 
0.3%
  762
 
0.2%
Other values (38) 5645
 
1.7%
Inherited
ValueCountFrequency (%)
́ 129
56.8%
̀ 71
31.3%
̂ 19
 
8.4%
̈ 4
 
1.8%
̧ 4
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1606232
96.6%
None 47042
 
2.8%
Punctuation 8718
 
0.5%
Diacriticals 227
 
< 0.1%
Modifier Letters 3
 
< 0.1%
Phonetic Ext 1
 
< 0.1%
Letterlike Symbols 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 270652
16.9%
e 199780
12.4%
s 100676
 
6.3%
a 98695
 
6.1%
n 98371
 
6.1%
r 97557
 
6.1%
i 88154
 
5.5%
t 83787
 
5.2%
u 79971
 
5.0%
l 72943
 
4.5%
Other values (70) 415646
25.9%
None
ValueCountFrequency (%)
é 27445
58.3%
à 6158
 
13.1%
è 5389
 
11.5%
ê 2531
 
5.4%
î 803
 
1.7%
ô 798
 
1.7%
  762
 
1.6%
ç 622
 
1.3%
ù 604
 
1.3%
â 476
 
1.0%
Other values (33) 1454
 
3.1%
Punctuation
ValueCountFrequency (%)
7661
87.9%
911
 
10.4%
118
 
1.4%
8
 
0.1%
8
 
0.1%
4
 
< 0.1%
2
 
< 0.1%
2
 
< 0.1%
2
 
< 0.1%
1
 
< 0.1%
Diacriticals
ValueCountFrequency (%)
́ 129
56.8%
̀ 71
31.3%
̂ 19
 
8.4%
̈ 4
 
1.8%
̧ 4
 
1.8%
Modifier Letters
ValueCountFrequency (%)
ʼ 2
66.7%
ʳ 1
33.3%
Phonetic Ext
ValueCountFrequency (%)
1
100.0%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%

boxoffice
Real number (ℝ)

HIGH CORRELATION 

Distinct4525
Distinct (%)92.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean817863.82
Minimum18
Maximum1.55 × 10^8
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size38.5 KiB

Quantile statistics

Minimum18
5-th percentile5295.2
Q147558
median141458
Q3362932
95-th percentile2759818.4
Maximum1.55 × 10^8
Range1.5499998 × 10^8
Interquartile range (IQR)315374

Descriptive statistics

Standard deviation3772079.9
Coefficient of variation (CV)4.6121125
Kurtosis635.72831
Mean817863.82
Median Absolute Deviation (MAD)115170
Skewness19.258206
Sum4.0181649 × 10^9
Variance1.4228587 × 10^13
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
127909 4
 
0.1%
38457 4
 
0.1%
3252896 4
 
0.1%
210904 4
 
0.1%
559378 4
 
0.1%
5764 3
 
0.1%
250597 3
 
0.1%
13099 3
 
0.1%
1576425 3
 
0.1%
6597 3
 
0.1%
Other values (4515) 4878
99.3%
ValueCountFrequency (%)
18 1
 
< 0.1%
28 1
 
< 0.1%
31 1
 
< 0.1%
36 1
 
< 0.1%
37 1
 
< 0.1%
38 1
 
< 0.1%
40 3
0.1%
44 1
 
< 0.1%
45 1
 
< 0.1%
57 1
 
< 0.1%
ValueCountFrequency (%)
155000000 1
< 0.1%
80500000 1
< 0.1%
56200000 1
< 0.1%
47606480 1
< 0.1%
40078000 1
< 0.1%
34119372 1
< 0.1%
33636303 1
< 0.1%
30940732 1
< 0.1%
28416365 1
< 0.1%
27487144 1
< 0.1%

Interactions

[Figures: 64 pairwise interaction plots (Matplotlib v3.7.2)]

Correlations

[Figure: correlation heatmap (Matplotlib v3.7.2)]
| | durée | annee_production | budget | box_office_total | note_presse | note_spectateurs | nombre_article | boxoffice | type_film |
| durée | 1.000 | -0.047 | 0.295 | 0.176 | 0.166 | 0.351 | 0.162 | 0.178 | 0.499 |
| annee_production | -0.047 | 1.000 | -0.223 | -0.311 | -0.001 | 0.066 | 0.061 | -0.317 | 0.070 |
| budget | 0.295 | -0.223 | 1.000 | 0.508 | -0.066 | 0.008 | 0.000 | 0.463 | 1.000 |
| box_office_total | 0.176 | -0.311 | 0.508 | 1.000 | 0.023 | 0.099 | 0.209 | 0.908 | 0.000 |
| note_presse | 0.166 | -0.001 | -0.066 | 0.023 | 1.000 | 0.607 | 0.469 | -0.040 | 0.017 |
| note_spectateurs | 0.351 | 0.066 | 0.008 | 0.099 | 0.607 | 1.000 | 0.399 | 0.027 | 0.028 |
| nombre_article | 0.162 | 0.061 | 0.000 | 0.209 | 0.469 | 0.399 | 1.000 | 0.141 | 0.000 |
| boxoffice | 0.178 | -0.317 | 0.463 | 0.908 | -0.040 | 0.027 | 0.141 | 1.000 | 0.000 |
| type_film | 0.499 | 0.070 | 1.000 | 0.000 | 0.017 | 0.028 | 0.000 | 0.000 | 1.000 |
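
To read the matrix above programmatically, one option is to list the variable pairs whose coefficient exceeds a cut-off. The sketch below uses the values copied from the matrix. The 0.5 threshold is an assumption, chosen because it reproduces the pairs named in the report's high-correlation alerts, and is not taken from the profiler's configuration.

```python
from itertools import combinations

columns = [
    "durée", "annee_production", "budget", "box_office_total", "note_presse",
    "note_spectateurs", "nombre_article", "boxoffice", "type_film",
]

# Coefficients copied row by row from the correlation matrix above.
matrix = [
    [1.000, -0.047, 0.295, 0.176, 0.166, 0.351, 0.162, 0.178, 0.499],
    [-0.047, 1.000, -0.223, -0.311, -0.001, 0.066, 0.061, -0.317, 0.070],
    [0.295, -0.223, 1.000, 0.508, -0.066, 0.008, 0.000, 0.463, 1.000],
    [0.176, -0.311, 0.508, 1.000, 0.023, 0.099, 0.209, 0.908, 0.000],
    [0.166, -0.001, -0.066, 0.023, 1.000, 0.607, 0.469, -0.040, 0.017],
    [0.351, 0.066, 0.008, 0.099, 0.607, 1.000, 0.399, 0.027, 0.028],
    [0.162, 0.061, 0.000, 0.209, 0.469, 0.399, 1.000, 0.141, 0.000],
    [0.178, -0.317, 0.463, 0.908, -0.040, 0.027, 0.141, 1.000, 0.000],
    [0.499, 0.070, 1.000, 0.000, 0.017, 0.028, 0.000, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not read from the report

# Visit each unordered pair once (upper triangle of the matrix).
high_pairs = [
    (a, b, matrix[i][j])
    for (i, a), (j, b) in combinations(enumerate(columns), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for a, b, value in high_pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

With this threshold the durée and type_film pair (0.499) falls just below the cut-off.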

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
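
Nullity correlation can be illustrated as the Pearson correlation between the presence indicators (1 = present, 0 = missing) of two columns. The sketch below is a standard-library approximation of that idea, not the profiler's own implementation. The function name and the toy columns are hypothetical.

```python
import math
from typing import Optional, Sequence

def nullity_correlation(a: Sequence[Optional[object]], b: Sequence[Optional[object]]) -> float:
    """Pearson correlation between the presence indicators of two equal-length columns."""
    x = [0 if v is None else 1 for v in a]
    y = [0 if v is None else 1 for v in b]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        # A column that is always present (or always missing) has no defined correlation.
        return math.nan
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy columns, not taken from the dataset.
budget = [140000000.0, None, None, 778.0]
recompenses = ["3 nominations", None, None, "9 nominations"]
genre = [None, "Drame", "Famille", None]

print(nullity_correlation(budget, recompenses))  # present together, missing together
print(nullity_correlation(budget, genre))        # one present exactly when the other is missing
```

A value near 1 means the two columns tend to be missing in the same rows; a value near -1 means one tends to be present when the other is missing.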

Sample

| | titre | date | genre | durée | réalisateur | distributeur | acteurs | titre_original | nationalités | langue_d_origine | type_film | annee_production | budget | box_office_total | note_presse | note_spectateurs | nombre_article | recompenses | description | boxoffice |
| 0 | Les Vengeances de Maître Poutifard | 2023-06-28 | Famille | 90.0 | Pierre-François Martin-Laval | UGC Distribution | Christian Clavier,Isabelle Nanty,Jennie-Anne Walker | NaN | France,Belgique | Français | Long-métrage | 2023 | NaN | 360119.0 | 2.0 | 2.1 | 5.0 | NaN | Instituteur à la retraite, Robert Poutifard n'a plus qu'une idée en tête : se venger de ses anciens élèves qui ont gâché sa vie ! Pour l’aider à mettre en place son plan diabolique, il a la meilleure des complices à ses côtés... sa maman. Ensemble, ils vont leur en faire voir de toutes les couleurs ! La vengeance est un plat qui se mange froid, et Robert Poutifard leur prépare une vraie surprise du chef. | 180735 |
| 1 | Le Challenge | 2023-06-21 | NaN | 103.0 | Gene Stupnitsky | Sony Pictures Releasing France | Jennifer Lawrence,Andrew Barth Feldman,Laura Benanti | No Hard Feelings | NaN | Anglais | Long-métrage | 2023 | NaN | 116813.0 | 3.0 | 3.1 | 5.0 | NaN | Maddie est sur le point de perdre sa maison d’enfance et elle pense avoir trouvé la solution à ses problèmes financiers lorsqu’elle tombe sur une offre d’emploi intrigante : parents fortunés cherchent quelqu’un pour emmener Percy, leur fils introverti de 19 ans, dans une série de « dates » afin de le décoincer avant qu’il ne parte pour l’université. A la grande surprise de Maddie, Percy rend ce challenge plus compliqué que prévu et le temps est compté. Elle a un été pour relever ce challenge ou se retrouver sans toit. | 54361 |
| 2 | La Petite sirène | 2023-05-24 | Famille,Fantastique | 136.0 | Rob Marshall | The Walt Disney Company France | Halle Bailey,Cerise Calixte,Jonah Hauer-King | The Little Mermaid | NaN | Anglais | Long-métrage | 2023 | NaN | 1688076.0 | 2.5 | 2.3 | 20.0 | NaN | Les années 1830, dans les eaux d'une île fictive des Caraïbes. Ariel, la benjamine des filles du roi Triton, est une jeune sirène belle et fougueuse dotée d’un tempérament d’aventurière. Rebelle dans l’âme, elle n’a de cesse d’être attirée par le monde qui existe par-delà les flots. Au détour de ses escapades à la surface, elle va tomber sous le charme du prince Eric. Alors qu'il est interdit aux sirènes d'interagir avec les humains, Ariel sent pourtant qu’elle doit suivre son cœur. Elle conclut alors un accord avec Ursula, la terrible sorcière des mers, qui lui octroie le pouvoir de vivre sur la terre ferme, mais sans se douter que ce pacte met sa vie - et la couronne de son père - en danger... | 582814 |
| 3 | Un homme idéal | 2015-03-18 | NaN | 97.0 | Yann Gozlan | Mars Films | Pierre Niney,Ana Girardot,André Marcon | NaN | NaN | Français | Long-métrage | 2014 | NaN | 666835.0 | 3.2 | 3.8 | 24.0 | NaN | Mathieu, 25 ans, aspire depuis toujours à devenir un auteur reconnu. Un rêve qui lui semble inaccessible car malgré tous ses efforts, il n’a jamais réussi à être édité. En attendant, il gagne sa vie en travaillant chez son oncle qui dirige une société de déménagement… | 337525 |
| 4 | Mission : Impossible - Protocole fantôme | 2011-12-14 | Espionnage,Thriller | 133.0 | Brad Bird | Paramount Pictures France | Tom Cruise,Jeremy Renner,Simon Pegg | Mission: Impossible - Ghost Protocol | NaN | Anglais | Long-métrage | 2011 | 140000000.0 | 2414279.0 | 3.9 | 3.8 | 26.0 | 3 nominations | Impliquée dans l'attentat terroriste du Kremlin, l'agence Mission Impossible (IMF) est totalement discréditée. Tandis que le président lance l'opération "Protocole Fantôme", Ethan Hunt, privé de ressources et de renfort, doit trouver le moyen de blanchir l'agence et de déjouer toute nouvelle tentative d'attentat. Mais pour compliquer encore la situation, l'agent doit s'engager dans cette mission avec une équipe de fugitifs d'IMF dont il n'a pas bien cerné les motivations… | 827046 |
| 5 | Il Boemo | 2023-06-21 | Historique | 140.0 | Petr Vaclav | Nour Films | Vojtěch Dyk,Barbara Ronchi,Elena Radonicich | NaN | République tchèque,Italie,Slovaquie | Italien | Long-métrage | 2022 | NaN | 87911.0 | 4.0 | 4.0 | 19.0 | NaN | 1764. Dans une Venise libertine, le musicien et compositeur Josef Myslivecek, surnommé « Il Boemo », ne parvient pas à percer malgré son talent. Sa liaison avec une femme de la cour lui permet d’accéder à son rêve et de composer un opéra. Dès lors sa renommée grandit, mais jusqu’où ira-t-il ? La vie, l’œuvre et les frasques d’un compositeur de génie oublié que le jeune Mozart admirait. | 34803 |
| 6 | Fast & Furious X | 2023-05-17 | NaN | 141.0 | Louis Leterrier | Universal Pictures International France | Vin Diesel,Michelle Rodriguez,Jason Momoa | Fast X | NaN | Anglais | Long-métrage | 2023 | NaN | 2294174.0 | 2.9 | 3.1 | 15.0 | NaN | Après bien des missions et contre toute attente, Dom Toretto et sa famille ont su déjouer, devancer, surpasser et distancer tous les adversaires qui ont croisé leur route. Ils sont aujourd’hui face à leur ennemi le plus terrifiant et le plus intime : émergeant des brumes du passé, ce revenant assoiffé de vengeance est bien déterminé à décimer la famille en réduisant à néant tout ce à quoi, et surtout à qui Dom ait jamais tenu. | 1140846 |
| 7 | L'Extraordinaire Mr. Rogers | NaN | Drame | 109.0 | Marielle Heller | Sony Pictures Releasing France | Tom Hanks,Matthew Rhys,Susan Kelechi Watson | A Beautiful Day in the Neighborhood | NaN | Anglais | Long-métrage | 2019 | NaN | NaN | 3.3 | 3.1 | 4.0 | 5 nominations | L'histoire de Fred Rogers, un homme de télé américain dont le programme éducatif | 13251238 |
| 8 | Beau Is Afraid | 2023-04-26 | Aventure,Drame | 179.0 | Ari Aster | ARP Sélection | Joaquin Phoenix,Nathan Lane,Amy Ryan | NaN | NaN | Anglais | Long-métrage | 2023 | NaN | 70862.0 | 3.6 | 3.3 | 31.0 | NaN | Beau tente désespérément de rejoindre sa mère. Mais l’univers semble se liguer contre lui… | 35019 |
| 9 | En corps | 2022-03-30 | Drame,Comédie | 118.0 | Cédric Klapisch | StudioCanal | Marion Barbeau,Hofesh Shechter,Denis Podalydès | NaN | NaN | Français | Long-métrage | 2022 | 778.0 | 1385302.0 | 3.4 | 4.1 | 33.0 | 9 nominations | Elise, 26 ans est une grande danseuse classique. Elle se blesse pendant un spectacle et apprend qu’elle ne pourra plus danser. Dès lors sa vie va être bouleversée, Elise va devoir apprendre à se réparer… Entre Paris et la Bretagne, au gré des rencontres et des expériences, des déceptions et des espoirs, Elise va se rapprocher d’une compagnie de danse contemporaine. Cette nouvelle façon de danser va lui permettre de retrouver un nouvel élan et aussi une nouvelle façon de vivre. | 332971 |
 | titre | date | genre | durée | réalisateur | distributeur | acteurs | titre_original | nationalités | langue_d_origine | type_film | annee_production | budget | box_office_total | note_presse | note_spectateurs | nombre_article | recompenses | description | boxoffice
4903 | Alabama Monroe | 2013-08-28 | NaN | 109.0 | Felix Van Groeningen | Bodega Films / Help! Distribution | Johan Heldenbergh,Veerle Baetens,Nell Cattrysse | The Broken Circle Breakdown | NaN | Flamand, Anglais | Long-métrage | 2012 | NaN | NaN | 3.7 | 4.3 | 21.0 | 2 prix et 9 nominations | NaN | 4148
4904 | Le Château dans le ciel | NaN | Aventure,Fantastique,Famille | 124.0 | Hayao Miyazaki | \n615 374 entrées\n | Mayumi Tanaka,Jim Cummings,Hiroshi Ito | Tenku no shiro Rapyuta | NaN | Japonais | Long-métrage | 1986 | NaN | NaN | 5.0 | 4.3 | 455.0 | NaN | Qui est vraiment Sheeta, la petite fille porteuse d’une pierre en pendentif aux pouvoirs magiques qui suscite bien des convoitises ? Retenue prisonnière à bord d’un dirigeable, l’enfant affronte une bande de pirates de l’air menée par la très pittoresque Dora, puis une armée de militaires à la solde de Muska, un gentleman machiavélique trop poli pour être honnête. Sauvée par le jeune Pazu, Sheeta se réfugie dans un village de mineurs. Là, elle tentera avec le garçon de percer le secret de ses origines pour prouver que l’histoire de Laputa, l’île merveilleuse flottant dans les airs, n’est pas une légende… | 227573
4905 | Avengers: Infinity War | 2018-04-25 | Action,Science fiction | 156.0 | Joe Russo | The Walt Disney Company France | Robert Downey Jr.,Chris Hemsworth,Mark Ruffalo | NaN | NaN | Anglais | Long-métrage | 2018 | NaN | NaN | 3.4 | 4.3 | 21.0 | 3 nominations | Les Avengers et leurs alliés devront être prêts à tout sacrifier pour neutraliser le redoutable Thanos avant que son attaque éclair ne conduise à la destruction complète de l’univers. | 2565953
4906 | Une merveilleuse histoire du temps | 2015-01-21 | Drame | 123.0 | James Marsh | Universal Pictures International France | Eddie Redmayne,Felicity Jones,Tom Prior | The Theory of Everything | NaN | Anglais, Français | Long-métrage | 2014 | NaN | NaN | 3.3 | 4.2 | 27.0 | 7 prix et 16 nominations | 1963, en Angleterre, Stephen, brillant étudiant en Cosmologie à l’Université de Cambridge, entend bien donner une réponse simple et efficace au mystère de la création de l’univers. De nouveaux horizons s’ouvrent quand il tombe amoureux d’une étudiante en art, Jane Wilde. Mais le jeune homme, alors dans la fleur de l’âge, se heurte à un diagnostic implacable : une dystrophie neuromusculaire plus connue sous le nom de maladie de Charcot va s’attaquer à ses membres, sa motricité, et son élocution, et finira par le tuer en l’espace de deux ans. | 97907
4907 | HEIMAT II – L’exode | 2013-10-23 | Historique | 231.0 | Edgar Reitz | Les Films du Losange | Jan Dieter Schneider,Antonia Bill,Maximilian Scheidt | Die andere Heimat - Chronik einer Sehnsucht (part 2) | Allemagne,France | Allemand | Long-métrage | 2013 | NaN | NaN | 4.2 | 4.2 | 15.0 | NaN | NaN | 30068
4908 | Star Wars : Episode III - La Revanche des Sith | 2005-05-18 | Action,Aventure | 140.0 | George Lucas | 20th Century Studios | Hayden Christensen,Ewan McGregor,Natalie Portman | Star Wars: Episode III - Revenge of the Sith | NaN | Anglais | Long-métrage | 2005 | 115000000.0 | NaN | 3.9 | 4.2 | 23.0 | 5 prix et 13 nominations | La Guerre des Clones fait rage. Une franche hostilité oppose désormais le Chancelier Palpatine au Conseil Jedi. Anakin Skywalker, jeune Chevalier Jedi pris entre deux feux, hésite sur la conduite à tenir. Séduit par la promesse d'un pouvoir sans précédent, tenté par le côté obscur de la Force, il prête allégeance au maléfique Darth Sidious et devient Dark Vador. | 3303005
4909 | Spider-Man 3 | 2007-01-05 | Action | 139.0 | Sam Raimi | Gaumont Columbia Tristar Films | Tobey Maguire,Kirsten Dunst,James Franco | NaN | NaN | Anglais | Long-métrage | 2007 | 258000000.0 | NaN | 3.7 | 3.5 | 27.0 | 1 prix et 13 nominations | Peter Parker a enfin réussi à concilier son amour pour Mary-Jane et ses devoirs de super-héros. Mais l'horizon s'obscurcit. La brutale mutation de son costume, qui devient noir, décuple ses pouvoirs et transforme également sa personnalité pour laisser ressortir l'aspect sombre et vengeur que Peter s'efforce de contrôler. | 2778533
4910 | Taxi 3 | 2003-01-29 | Comédie | 90.0 | Gérard Krawczyk | ARP Sélection | Frédéric Diefenthal,Samy Naceri,Marion Cotillard | NaN | NaN | Français | Long-métrage | 2002 | 14490000.0 | NaN | 2.8 | 1.5 | 8.0 | NaN | Marseille, à l'approche de Noël. Daniel ne cesse de rajouter des gadgets à son taxi. Au point de faire passer son bolide avant sa compagne, Lilly, qui décide de retourner vivre chez ses parents. Petra, elle, reproche à Emilien d'avoir la tête ailleurs. Celui-ci enrage en effet de ne pas avoir encore arrêté le gang des pères Noël, qui multiplie les braquages depuis huit mois. | 2251493
4911 | Spider-Man | 2002-12-06 | Action | 121.0 | Sam Raimi | Columbia TriStar Films | Tobey Maguire,Willem Dafoe,Kirsten Dunst | NaN | NaN | Anglais | Long-métrage | 2002 | 139000000.0 | NaN | 4.2 | 4.0 | 25.0 | 4 nominations | Orphelin, Peter Parker est élevé par sa tante May et son oncle Ben dans le quartier Queens de New York. Tout en poursuivant ses études à l'université, il trouve un emploi de photographe au journal | 1903136
4912 | Arthur et les Minimoys | 2006-11-29 | Aventure,Fantastique,Famille | 94.0 | Luc Besson | EuropaCorp Distribution | Freddie Highmore,Mia Farrow,Mylène Farmer | NaN | NaN | Anglais | Long-métrage | 2006 | 65200000.0 | NaN | 3.3 | 3.0 | 25.0 | 2 prix et 3 nominations | Comme tous les enfants de son âge, Arthur est fasciné par les histoires que lui raconte sa grand-mère pour l'endormir : ses rêves sont peuplés de tribus africaines et d'inventions incroyables, tirées d'un vieux grimoire, souvenir de son grand-père mystérieusement disparu depuis quatre ans. | 1510369